AITopics | pvalue 0

Beyond Average Performance -- exploring regions of deviating performance for black box classification models

Torgo, Luis, Azevedo, Paulo, Areosa, Ines

arXiv.org Machine LearningSep-16-2021

Machine learning models are becoming increasingly popular in different types of settings. This is mainly caused by their ability to achieve a level of predictive performance that is hard to match by human experts in this new era of big data. With this usage growth comes an increase of the requirements for accountability and understanding of the models' predictions. However, the degree of sophistication of the most successful models (e.g. ensembles, deep learning) is becoming a large obstacle to this endeavour as these models are essentially black boxes. In this paper we describe two general approaches that can be used to provide interpretable descriptions of the expected performance of any black box classification model. These approaches are of high practical relevance as they provide means to uncover and describe in an interpretable way situations where the models are expected to have a performance that deviates significantly from their average behaviour. This may be of critical relevance for applications where costly decisions are driven by the predictions of the models, as it can be used to warn end users against the usage of the models in some specific cases.

ant sup 0, pvalue 0, subgroup, (15 more...)

arXiv.org Machine Learning

2109.08216

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Austria > Vienna (0.14)
Europe > Portugal > Porto > Porto (0.04)
(5 more...)

Genre: Research Report (0.82)

Industry:

Transportation > Air (0.81)
Health & Medicine > Therapeutic Area (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Add feedback

Geometric entropy minimization (GEM) for anomaly detection and localization

Hero, Alfred O.

Neural Information Processing SystemsDec-31-2007

We introduce a novel adaptive nonparametric anomaly detection approach, called GEM, that is based on the minimal covering properties of K-point entropic graphs when constructed on N training samples from a nominal probability distribution. Such graphs have the property that as N their span recovers the entropy minimizing set that supports at least ρ K/N(100)% of the mass of the Lebesgue part of the distribution. When a test sample falls outside of the entropy minimizing set an anomaly can be declared at a statistical level of significance α 1 ρ. A method for implementing this nonparametric anomaly detector is proposed that approximates this minimum entropy set by the influence region of a K-point entropic graph built on the training data. By implementing an incremental leave-one-out k-nearest neighbor graph on resampled subsets of the training data GEM can efficiently detect outliers at a given level of significance and compute their empirical p-values. We illustrate GEM for several simulated and real data sets in high dimensional feature spaces.

anomaly detector, knng, training sample, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
North America > United States > Virginia > Arlington County > Arlington (0.04)
North America > United States > Florida > Orange County > Orlando (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > Experimental Study (0.67)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.55)

Add feedback

Geometric entropy minimization (GEM) for anomaly detection and localization

Hero, Alfred O.

Neural Information Processing SystemsDec-31-2007

We introduce a novel adaptive nonparametric anomaly detection approach, called GEM, that is based on the minimal covering properties of K-point entropic graphs when constructed on N training samples from a nominal probability distribution. Such graphs have the property that as N their span recovers the entropy minimizing set that supports at least ρ K/N(100)% of the mass of the Lebesgue part of the distribution. When a test sample falls outside of the entropy minimizing set an anomaly can be declared at a statistical level of significance α 1 ρ. A method for implementing this nonparametric anomaly detector is proposed that approximates this minimum entropy set by the influence region of a K-point entropic graph built on the training data. By implementing an incremental leave-one-out k-nearest neighbor graph on resampled subsets of the training data GEM can efficiently detect outliers at a given level of significance and compute their empirical p-values. We illustrate GEM for several simulated and real data sets in high dimensional feature spaces.

anomaly detector, knng, training sample, (14 more...)

Neural Information Processing Systems

Country: